An expected average reward criterion

Similar resources

Fuzzy Decision Processes with an Average Reward Criterion

Within the same framework as fuzzy decision processes for the discounted case, we specify an average fuzzy criterion model and develop its optimization by “fuzzy max order” under appropriate conditions. By introducing a relative value function, the average reward is characterized as the unique solution of the associated equation. We also derive the optimality equation using the “vanishing disco...

Markov decision evolutionary games with time average expected fitness criterion

We present a class of evolutionary games involving large populations that have many pairwise interactions between randomly selected players. The fitness of a player depends not only on the actions chosen in the interaction but also on the individual state of the players. Players stay permanently in the system and participate infinitely often in local interactions with other randomly selected pl...

Bounded Parameter Markov Decision Processes with Average Reward Criterion

Bounded parameter Markov Decision Processes (BMDPs) address the issue of dealing with uncertainty in the parameters of a Markov Decision Process (MDP). Unlike the case of an MDP, the notion of an optimal policy for a BMDP is not entirely straightforward. We consider two notions of optimality based on optimistic and pessimistic criteria. These have been analyzed for discounted BMDPs. Here we pro...

An Average-Reward Reinforcement Learning

Recently, there has been growing interest in average-reward reinforcement learning (ARL), an undiscounted optimality framework that is applicable to many different control tasks. ARL seeks to compute gain-optimal control policies that maximize the expected payoff per step. However, gain-optimality has some intrinsic limitations as an optimality criterion, since for example, it cannot distinguish ...

Journal

Journal title: Stochastic Processes and their Applications

Year: 1987

ISSN: 0304-4149

DOI: 10.1016/0304-4149(87)90055-x